An additive growth curve model with orthogonal design matrices is proposed in which
observations may have different profile forms. The proposed model allows us to fit data and then
estimate parameters in a more parsimonious way than the traditional growth curve model.
Two-stage generalized least-squares estimators for the regression coefficients are derived where
a quadratic estimator for the covariance of observations is taken as the first-stage estimator.
Consistency, asymptotic normality and asymptotic independence of these estimators are
investigated. Simulation studies and a numerical example are given to illustrate the efficiency and
parsimony of the proposed model for model specifications in the sense of minimizing Akaike's
information criterion (AIC).

Keywords: AIC; asymptotic normality; consistent estimator; growth curve model; quadratic

1. Introduction

In a variety of areas, observations are measured over multiple time points on a particular 
characteristic to investigate the temporal pattern of change on the characteristic. The 
observations of repeated measurements are usually analyzed by the growth curve model
(GCM), initiated by Potthoff and Roy [14]. Since then, parameter estimation, hypothesis
testing and prediction of future values have been investigated by numerous researchers, 
generating a substantial amount of literature, including [2, 3, 7, 8, 10, 11, 15, 16]. The 
basic idea of the growth curve model is to introduce some known functions, usually 
polynomial functions, so as to capture patterns of change for time-dependent measure- 
ments. We shall generalize the growth curve model to the case where observations of 
time-dependently repeated measurements may have polynomial functions with different 
degrees rather than polynomial functions with a common degree. In this article, different 
profile forms mean polynomial functions with different degrees and a profile form means 
polynomial functions with a common degree.

To motivate it, let us look at the following situation. We have many groups of animals,
with each group being subjected to a different treatment. Animals in all groups are
measured at the same $p$ time points and assumed to have the same covariance matrix
$\Sigma$. The growth curve associated with the $i$th group is
$\theta_{i0} + \theta_{i1}t + \theta_{i2}t^2 + \cdots + \theta_{i,q_i-1}t^{q_i-1}$,
implying that the growth curves may have different profiles, say $k$ profiles, not necessarily
one profile. There are $m_i$ groups that have the same profile form with index $i$ and $n_i$
individuals in total. Here $n = \sum_{i=1}^{k} n_i$. The simplest situation is that each group has a
different profile form. Assume that there are $k$ groups of individuals and $p$ observing time
points such that $k + p < n$. For $i = 1, 2, \ldots, k$, put

$$Z_i = \begin{pmatrix}
1 & t_1 & t_1^2 & \cdots & t_1^{q_i-1} \\
1 & t_2 & t_2^2 & \cdots & t_2^{q_i-1} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & t_p & t_p^2 & \cdots & t_p^{q_i-1}
\end{pmatrix},
\qquad
\Theta_i = (\theta_{i0}, \theta_{i1}, \theta_{i2}, \ldots, \theta_{i,q_i-1}),$$

and

$$X_i = (x_{i1}, x_{i2}, \ldots, x_{in})' \in \mathbb{R}^n,$$

where $x_{i(p_{i-1}+j)} = 1$ for $j = 1, 2, \ldots, n_i$ with $p_0 = 0$, $p_i = \sum_{j=1}^{i} n_j$, and the
other $x_{ij}$'s are 0.

Generalizing the above situation we propose the following additive growth curve model

$$Y = \sum_{i=1}^{k} X_i\Theta_iZ_i' + \mathcal{E}, \qquad \mathcal{E} \sim \mathcal{G}(0, I \otimes \Sigma), \tag{1}$$

with orthogonal design matrices or mutually orthogonal column spaces of design matrices,
defined as

$$\operatorname{rank}(X_i) + p < n \quad\text{and}\quad \mathcal{C}(X_i) \perp \mathcal{C}(X_j) \ \text{ or } \ X_i'X_j = 0 \ \text{ for any distinct } i, j, \tag{2}$$

where $Y$ is an $n \times p$ matrix of observations; $X_i$, $Z_i$ ($1 \le i \le k$) are known $n \times m_i$ ($n > m_i$)
full-rank design matrices and $p \times q_i$ ($p > q_i$) full-rank profile matrices, respectively; $\Theta_i$
($1 \le i \le k$) are unknown $m_i \times q_i$ matrices of the regression coefficients; $\mathcal{C}(X)$ denotes
the column space of the matrix $X$; $\mathcal{G}$ is a general continuous type distribution function;
observations on individuals are independent; and the rows of the random error matrix
$\mathcal{E}$ are independent and identically distributed with mean zero and a common unknown
covariance matrix $\Sigma$ of order $p$.

The model (1) subject to (2) will be demonstrated to have an advantage that it fits 
data in a more parsimonious way than the traditional growth curve model in the situ- 
ation where model specification is needed. In the above stated example of animals, the 
traditional growth curve model assumes that all observations have the same profile form, 
which may cause the model misspecification, underfitting or overfitting. 

On the other hand, Kollo and von Rosen [9], in Chapter 4, investigated an additive
growth curve model with nested column spaces generated by design matrices, that
is, the constraint $\mathcal{C}(X_1) \supseteq \mathcal{C}(X_2) \supseteq \cdots \supseteq \mathcal{C}(X_k)$ with $\operatorname{rank}(X_1) + p < n$, usually called the
extended growth curve model. Obviously, there is not an inclusion relationship between
the extended growth curve model and the proposed model (1) with (2) because the
constraint of nested column spaces and the constraint of orthogonal column spaces have
no inclusion relationship. An extension of the growth curve model proposed in [17] did
not include the proposed model (1) with (2), either.

This paper will investigate estimation of parameters and properties of the correspond- 
ing estimators in the proposed model (1) with (2), including consistency and asymptotic 
normality. 

The organization of the paper is as follows. Two-stage generalized least-squares esti- 
mators of the regression coefficients are obtained in Section 2. Both the consistency of 
the estimators for the regression coefficients and a quadratic estimator for the unknown 
covariance are investigated in Section 3, while their asymptotic normalities under cer- 
tain conditions are investigated in Section 4. Simulation studies are given in Section 5. 
A numerical example is explored to illustrate our techniques in Section 6. Finally, brief 
concluding remarks are stated in Section 7. 

Throughout this paper, the following notations are used. $\mathcal{M}_{n\times p}$ denotes the set of
all $n \times p$ matrices over the real field $\mathbb{R}$ with trace inner product $\langle\cdot,\cdot\rangle$ and $\|\cdot\|$ denotes the
trace norm on the set $\mathcal{M}_{n\times p}$. $\operatorname{tr}(A)$ denotes the trace of matrix $A$ and $I_n$ denotes the
identity matrix of order $n$. For an $n \times p$ matrix $Y$, we write $Y = [y_1, \ldots, y_n]'$, $y_i \in \mathbb{R}^p$,
where $\mathbb{R}^p$ is the $p$-dimensional real space and $\operatorname{vec}(Y)$ denotes the $np$-dimensional vector
$[y_1', \ldots, y_n']'$. Here the vec operator transforms a matrix into a vector by stacking the
rows of the matrix one underneath another. $Y \sim \mathcal{G}(\mu, I \otimes \Sigma)$ means that $Y$ follows a
general continuous type distribution $\mathcal{G}$ with $E(Y) = \mu$ and
$E[\operatorname{vec}(Y - \mu)\operatorname{vec}(Y - \mu)'] = I \otimes \Sigma$.
The Kronecker product $A \otimes B$ of matrices $A$ and $B$ is defined to be $A \otimes B = (a_{ij}B)$.
Then we have $\operatorname{vec}(ABC) = (A \otimes C')\operatorname{vec}(B)$. Let $A^{+}$ denote the Moore-Penrose inverse
of $A$. $P_X = X(X'X)^{-}X'$ denotes the orthogonal projection onto the column space $\mathcal{C}(X)$
of a matrix $X$. $M_X = I - X(X'X)^{-}X'$ is the orthogonal projection onto the orthogonal
complement $\mathcal{C}(X)^{\perp}$ of $\mathcal{C}(X)$.
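The row-stacking convention matters for the Kronecker identity above, so a small numerical check may help. The following is a minimal sketch in NumPy; the helper names `vec` and `proj` and the random test matrices are illustrative assumptions, not part of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def vec(M):
    """Row-stacking vec: the rows of M are placed one underneath another."""
    return M.reshape(-1)  # NumPy's default C order flattens row by row

A = rng.normal(size=(3, 4))
B = rng.normal(size=(4, 2))
C = rng.normal(size=(2, 5))

# vec(ABC) = (A kron C') vec(B) under the row-stacking convention.
identity_holds = np.allclose(vec(A @ B @ C), np.kron(A, C.T) @ vec(B))

def proj(X):
    """Orthogonal projection onto the column space of X (hypothetical helper)."""
    return X @ np.linalg.pinv(X.T @ X) @ X.T

X = rng.normal(size=(6, 2))
P = proj(X)            # P_X
M = np.eye(6) - P      # M_X, projection onto the orthogonal complement
```

With the more common column-stacking vec the identity would read vec(ABC) = (C' kron A) vec(B), which is why the convention is fixed here.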

2. Two-stage generalized least-squares estimators 

Recall that the regression coefficients, $\Theta_1, \ldots, \Theta_k$, in the model (1) are defined before a
design is planned and observation $Y$ is obtained. Thus the rows of the design matrices,
$X_1, \ldots, X_k$, are added one after another and the profile forms, $Z_1, \ldots, Z_k$, do not depend
on the sample size $n$. So, we shall only consider the case of full-rank $X_i$'s and $Z_i$'s in the
present paper.



Set

$$\mu = \sum_{i=1}^{k} X_i\Theta_iZ_i'. \tag{3}$$

Equation (3) is said to be the mean structure of the model (1).

A statistic $\hat\mu_{\mathrm{gls}}(Y)$ is said to be the generalized least-squares (GLS) estimator of
parameter matrix $\mu$ if the minimum value of function $\langle Y - \mu, Y - \mu\rangle$ is attained at the
point $\mu = \hat\mu_{\mathrm{gls}}(Y)$, where the inner product $\langle\cdot,\cdot\rangle$ or the trace norm $\|\cdot\|$ is associated with
the covariance $I \otimes \Sigma$ of $Y$: $\langle w_1, w_2\rangle = \operatorname{vec}(w_2)'(I \otimes \Sigma)^{-1}\operatorname{vec}(w_1)$ with $\|w\| = \langle w, w\rangle^{1/2}$
and $w, w_1, w_2 \in \mathcal{M}_{n\times p}$.

Generally speaking, we actually know nothing or very little about the covariance $\Sigma$ of
observations of repeated measurements before we measure these observations. So,
alternatively, a two-stage estimation is used to find an estimator of $\mu$, denoted by $\hat\mu_{\mathrm{2sgls}}(Y)$.
The two-stage estimation procedure is as follows: First, based on the observation $Y$, find
a first-stage estimator $\hat\Sigma$ of $\Sigma$. Second, replace the unknown $\Sigma$ with the first-stage
estimator $\hat\Sigma$ and then find $\hat\mu_{\mathrm{2sgls}}(Y)$ through the GLS method. For convenience, we shall
omit the subscript of $\hat\mu_{\mathrm{2sgls}}(Y)$.

In order to get a good first-stage estimator $\hat\Sigma$ of $\Sigma$, let us have a close look at the
following quadratic statistic (a quadratic form that does not involve the parameters):

$$\hat\Sigma(Y) = Y'WY, \qquad W = \frac{1}{r}\Bigl(I - \sum_{i=1}^{k} P_{X_i}\Bigr), \tag{4}$$

where $r = n - \sum_{i=1}^{k}\operatorname{rank}(X_i)$.

(1) The statistic $\hat\Sigma(Y)$ is easily proven to be positive definite with probability 1; see
Theorem 3.1.4 of [13]. So, $\hat\Sigma^{-1}(Y)$ exists with probability 1.

(2) Under the assumption of normality, the quadratic estimator $\hat\Sigma(Y)$ given by
equation (4) follows a Wishart distribution; see [4].

(3) $\hat\Sigma(Y)$ is an unbiased invariant estimator of $\Sigma$; see [5]. A similar result for the
growth curve model was obtained by Zezula [18].

It follows from the above properties that the statistic $\hat\Sigma(Y)$ seems to be a very good
candidate for the first-stage estimator. As a consequence, it will be taken as the first-stage
estimator $\hat\Sigma$ in our subsequent discussion.
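As an illustration of the quadratic statistic in (4), the sketch below computes Y'WY without forming any n x n projection matrix and checks the invariance property numerically. The function name `quadratic_cov_estimator` and the two-group indicator design are assumptions made for this example only.

```python
import numpy as np

def quadratic_cov_estimator(Y, X_list):
    """Return Y' W Y with W = (I - sum_i P_{X_i}) / r, r = n - sum_i rank(X_i)."""
    Y = np.asarray(Y, dtype=float)
    n = Y.shape[0]
    residual = Y.copy()
    rank_sum = 0
    for X in X_list:
        # X @ lstsq(X, Y) equals P_X Y, so the projection is applied implicitly.
        residual -= X @ np.linalg.lstsq(X, Y, rcond=None)[0]
        rank_sum += np.linalg.matrix_rank(X)
    r = n - rank_sum
    return Y.T @ residual / r

# Hypothetical two-group design with orthogonal indicator columns.
rng = np.random.default_rng(0)
n, p = 12, 3
X1 = np.zeros((n, 1))
X1[:5] = 1.0
X2 = 1.0 - X1
Y = rng.standard_normal((n, p))

S = quadratic_cov_estimator(Y, [X1, X2])
# Adding any mean term of the form X_1 * (row vector) should leave the estimate unchanged.
S_shifted = quadratic_cov_estimator(Y + X1 @ np.array([[1.0, -2.0, 3.0]]), [X1, X2])
```

The invariance check reflects the fact that W annihilates every column of the design matrices, so the statistic depends on the data only through the error matrix.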

For $i = 1, \ldots, k$, let

$$H_i(Y) = \hat\Sigma^{-1}(Y)Z_i(Z_i'\hat\Sigma^{-1}(Y)Z_i)^{-1}Z_i' = \hat\Sigma^{-1}(Y)(P_{Z_i}\hat\Sigma^{-1}(Y)P_{Z_i})^{+}. \tag{5}$$

Then, we easily see that

$$Z_i'H_i(Y) = Z_i', \qquad i = 1, \ldots, k. \tag{6}$$

When $\hat\Sigma(Y)$ is taken as the first-stage estimator, the following theorem provides the
explicit expression of the two-stage GLS estimators both for the mean matrix $\mu$ and the
regression coefficients, $\Theta_1, \ldots, \Theta_k$. Furthermore, under certain conditions, these estimators
are unbiased.

Theorem 2.1. Consider $\hat\Sigma = \hat\Sigma(Y)$ for the model (1) subject to (2). The following
statements hold.

(1) The two-stage GLS estimator $\hat\mu(Y)$ of $\mu$ is given by

$$\hat\mu(Y) = \sum_{i=1}^{k} P_{X_i}Y\hat\Sigma^{-1}(Y)(P_{Z_i}\hat\Sigma^{-1}(Y)P_{Z_i})^{+} = \sum_{i=1}^{k} P_{X_i}YH_i(Y). \tag{7}$$

(2) The two-stage GLS estimator $\hat\Theta_i(Y)$ of $\Theta_i$ is given by

$$\hat\Theta_i(Y) = (X_i'X_i)^{-1}X_i'YH_i(Y)Z_i(Z_i'Z_i)^{-1}. \tag{8}$$

(3) If the distribution of $\mathcal{E}$ is symmetric about the origin 0, the statistic $\hat\mu(Y)$ is an
unbiased estimator of the mean $\mu$. Moreover, for each $i$, the statistic $\hat\Theta_i(Y)$ is an
unbiased estimator of the regression coefficients $\Theta_i$.

The proof of Theorem 2.1 is deferred to the Appendix.
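To make formulas (4), (7) and (8) concrete, here is a minimal NumPy sketch of the two-stage procedure. The function name `two_stage_gls`, the indicator designs, the time points and the coefficient values are all illustrative assumptions; the sketch is not claimed to be the authors' implementation.

```python
import numpy as np

def two_stage_gls(Y, X_list, Z_list):
    """Two-stage GLS sketch for Y = sum_i X_i Theta_i Z_i' + E with X_i'X_j = 0."""
    Y = np.asarray(Y, dtype=float)
    n = Y.shape[0]
    # (X_i'X_i)^{-1} X_i'Y for every design block.
    coefs = [np.linalg.solve(X.T @ X, X.T @ Y) for X in X_list]
    # Stage 1: quadratic estimator (4) of the covariance.
    resid = Y - sum(X @ c for X, c in zip(X_list, coefs))
    r = n - sum(np.linalg.matrix_rank(X) for X in X_list)
    sigma_hat = Y.T @ resid / r
    s_inv = np.linalg.inv(sigma_hat)
    # Stage 2: plug sigma_hat into (8); G equals H_i Z_i (Z_i'Z_i)^{-1}.
    thetas = []
    mu_hat = np.zeros_like(Y)
    for X, Z, c in zip(X_list, Z_list, coefs):
        G = s_inv @ Z @ np.linalg.inv(Z.T @ s_inv @ Z)
        theta = c @ G
        thetas.append(theta)
        mu_hat += X @ theta @ Z.T   # equals P_{X_i} Y H_i, the i-th term of (7)
    return thetas, mu_hat, sigma_hat

# Hypothetical example: a linear profile for group 1, a quadratic one for group 2.
rng = np.random.default_rng(0)
t = np.array([-1.0, -0.5, 0.5, 1.0])
Z1 = np.vander(t, N=2, increasing=True)
Z2 = np.vander(t, N=3, increasing=True)
n = 40
X1 = np.zeros((n, 1))
X1[: n // 2] = 1.0
X2 = 1.0 - X1
Y = X1 @ np.array([[4.0, 2.0]]) @ Z1.T + X2 @ np.array([[3.0, 2.0, 1.0]]) @ Z2.T
Y = Y + rng.standard_normal((n, 4))

thetas, mu_hat, sigma_hat = two_stage_gls(Y, [X1, X2], [Z1, Z2])

# Shifting the mean of group 1 by delta should shift only the first estimate.
delta = np.array([[1.0, -1.0]])
thetas_shifted, _, sigma_shifted = two_stage_gls(Y + X1 @ delta @ Z1.T, [X1, X2], [Z1, Z2])
```

The shift experiment illustrates why identity (6) matters: because Z_i'H_i(Y) = Z_i', a change of Theta_i by delta is reproduced exactly by the estimator, while the first-stage covariance estimate and the other coefficient blocks are unaffected.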



3. Consistency 

Since $Y$ is associated with the sample size $n$, we shall use $Y_n$ to replace $Y$ in (4)-(8)
and then investigate the consistency of the estimator $\hat\Sigma(Y_n)$ and the estimators
$\hat\Theta_1(Y_n), \ldots, \hat\Theta_k(Y_n)$, as the sample size $n$ tends to infinity. Note that $X_i$ and $\mathcal{E}$ are also
associated with the sample size $n$.

Regarding the consistency of the quadratic estimator $\hat\Sigma(Y_n)$, we have the following
result.

Theorem 3.1. For the model (1) subject to (2), the statistic $\hat\Sigma(Y_n)$ defined by (4) is a
consistent estimator of the covariance matrix $\Sigma$.

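Proof. It follows from the invariance of the statistic that $\hat\Sigma(Y) = \hat\Sigma(\mathcal{E})$. And $\hat\Sigma(Y_n)$ can
be rewritten as

$$\hat\Sigma(Y_n) = \frac{n}{n - m}\Bigl[\frac{1}{n}\sum_{l=1}^{n}\varepsilon_l\varepsilon_l' - \frac{1}{n}\sum_{i=1}^{k}\mathcal{E}'P_{X_i}\mathcal{E}\Bigr], \tag{9}$$

where $m = \sum_{i=1}^{k}\operatorname{rank}(X_i)$ and $\mathcal{E} = (\varepsilon_1, \ldots, \varepsilon_n)' \sim \mathcal{G}(0, I_n \otimes \Sigma)$.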

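Note that $\{\varepsilon_l\varepsilon_l'\}_{l=1}^{n}$ is a random sample from a population with mean $E(\varepsilon_l\varepsilon_l') = \Sigma$.

Kolmogorov's strong law of large numbers tells us that

$$\frac{1}{n}\sum_{l=1}^{n}\varepsilon_l\varepsilon_l' \ \text{ converges almost surely to } \Sigma. \tag{10}$$
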

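Let $\epsilon > 0$. By Chebyshev's inequality and $E(\mathcal{E}'\mathcal{E}) = \operatorname{tr}(I)\Sigma$, we have

$$P\Bigl(\Bigl\|\frac{1}{n}\sum_{i=1}^{k}\mathcal{E}'P_{X_i}\mathcal{E}\Bigr\| \ge \epsilon\Bigr)
\le \frac{1}{n\epsilon}E\Bigl[\operatorname{tr}\Bigl(\mathcal{E}'\sum_{i=1}^{k}P_{X_i}\mathcal{E}\Bigr)\Bigr]
= \frac{1}{n\epsilon}\operatorname{tr}\Bigl(E[\mathcal{E}\mathcal{E}']\sum_{i=1}^{k}P_{X_i}\Bigr)
= \frac{\operatorname{tr}(\Sigma)}{n\epsilon}\operatorname{tr}\Bigl(\sum_{i=1}^{k}P_{X_i}\Bigr).$$

Since $\operatorname{tr}(\sum_{i=1}^{k}P_{X_i}) = \sum_{i=1}^{k}\operatorname{rank}(X_i)$ is a constant,
$P(\|\frac{1}{n}\sum_{i=1}^{k}\mathcal{E}'P_{X_i}\mathcal{E}\| \ge \epsilon)$ tends to 0
as the sample size $n$ tends to infinity. So

$$\frac{1}{n}\sum_{i=1}^{k}\mathcal{E}'P_{X_i}\mathcal{E} \ \text{ converges in probability to } 0. \tag{11}$$

Since convergence almost surely implies convergence in probability, by (10) and (11), we
obtain from (9) that $\hat\Sigma(Y_n)$ converges in probability to $\Sigma$, which completes the proof. □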
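The consistency statement can be illustrated by a small Monte Carlo experiment. The sketch below is only an illustration under assumed settings (normal errors, a two-group indicator design, and an arbitrary 3 x 3 covariance matrix); none of these choices come from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed true covariance and group means, chosen only for illustration.
Sigma = np.array([[1.0, 0.5, 0.25],
                  [0.5, 1.0, 0.5],
                  [0.25, 0.5, 1.0]])
chol = np.linalg.cholesky(Sigma)
group_means = np.array([[1.0, 2.0, 3.0],
                        [0.0, -1.0, 4.0]])

def sigma_hat_error(n):
    """Frobenius distance between the quadratic estimator (4) and Sigma."""
    n1 = n // 2
    X = np.zeros((n, 2))
    X[:n1, 0] = 1.0
    X[n1:, 1] = 1.0
    Y = X @ group_means + rng.standard_normal((n, 3)) @ chol.T
    # The two indicator columns are orthogonal, so P_{X_1} + P_{X_2} = P_X.
    resid = Y - X @ np.linalg.lstsq(X, Y, rcond=None)[0]
    sigma_hat = Y.T @ resid / (n - 2)
    return np.linalg.norm(sigma_hat - Sigma)

errors = {n: sigma_hat_error(n) for n in (50, 500, 5000, 50000)}
```

Under these assumptions the error should shrink roughly like the inverse square root of n, which is the behaviour suggested by decomposition (9).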

Assumption 1. For $i = 1, \ldots, k$,

$$\lim_{n\to\infty} n^{-1}X_i'X_i = R_i, \tag{12}$$

where $R_i$ is positive definite.

For convenience, we restate Lemma 3.2 of [6] as follows.

Lemma 3.2. For $i \in \{1, \ldots, k\}$, $H_i(Y_n)$ converges in probability to $H_i = \Sigma^{-1}(P_{Z_i}\Sigma^{-1}P_{Z_i})^{+}$.

On the consistency of the estimators $\hat\Theta_i(Y_n)$ of the regression coefficients, we obtain
the following theorem.

Theorem 3.3. For any fixed $i \in \{1, \ldots, k\}$, with Assumption 1, the statistic $\hat\Theta_i(Y_n)$ is a
consistent estimator of the regression coefficient $\Theta_i$.

Proof. Fix $i \in \{1, \ldots, k\}$. By equation (8), we obtain the following equation:

$$\hat\Theta_i(Y_n) = \Theta_i + S_i\mathcal{E}H_i(Y_n)K_i, \tag{13}$$

where $S_i = (X_i'X_i)^{-1}X_i'$ and $K_i = Z_i(Z_i'Z_i)^{-1}$. The second term of the right side in (13)
can be rewritten as

$$S_i\mathcal{E}H_i(Y_n)K_i = n(X_i'X_i)^{-1}\Bigl(\frac{1}{\sqrt n}X_i'\Bigr)\Bigl(\frac{1}{\sqrt n}\mathcal{E}H_i(Y_n)K_i\Bigr).$$

By condition (12), the elements of $X_i'/\sqrt n$ are bounded. In fact, the elements of $X_i'/\sqrt n$ are at most
of order $n^{-1/2}$ (see the proof of Lemma 4.1 below). So by (11), (12), Lemma 3.2 and
Theorem 11.2.12 of [12], the second term of the right side in (13) converges in probability
to 0. Thus, $\hat\Theta_i(Y_n)$ converges in probability to $\Theta_i$, which completes the proof. □



In order to prove the consistency of the estimators $\hat\Theta_1(Y_n), \hat\Theta_2(Y_n), \ldots, \hat\Theta_k(Y_n)$, the
conditions $\lim_{n\to\infty} n^{-1}X_i'X_i = R_i$ for $i = 1, \ldots, k$ have been used in Theorem 3.3. We
imagine that for each new observation, a new row is added to the matrices $X_i$ and that
the earlier rows remain intact in such a way that, for $i = 1, 2, \ldots, k$, the $m_i \times m_i$ elements
of $X_i'X_i$ are $O(n)$. In addition, we exclude the possibility that the limits of the $n^{-1}X_i'X_i$
are singular.

4. Asymptotic normality 

We have investigated the consistency of the estimators $\hat\Sigma(Y_n)$ and $\hat\Theta_i(Y_n)$ in the preceding
section. In this section, we shall investigate the asymptotic normality of $\sqrt n[\hat\Theta_i(Y_n) - \Theta_i]$
and $\sqrt n[\hat\Sigma(Y_n) - \Sigma]$ under certain conditions.

We need the following lemma in the proof of the subsequent results.

Lemma 4.1. Let $S_i = (X_i'X_i)^{-1}X_i' = (s_{i1}, s_{i2}, \ldots, s_{in})_{m_i\times n}$, where $s_{ij}$ is the $j$th column
of $S_i$. Then, under condition (12), the $m_i$ elements of $\sqrt n\,s_{ij}$ are $O(n^{-1/2})$ for any
$i \in \{1, \ldots, k\}$ and $j \in \{1, \ldots, n\}$.

The proof of Lemma 4.1 is deferred to the Appendix. 

Theorem 4.2. Under Assumption 1, the random matrix $\sqrt n\,S_i\mathcal{E}$ converges in distribution
to $\mathcal{N}_{m_i\times p}(0, R_i^{-1} \otimes \Sigma)$ for any $i \in \{1, \ldots, k\}$.

Also, the proof of Theorem 4.2 is deferred to the Appendix. 

Finally, by Theorem 4.2 and Slutsky's theorem, we obtain our main result on the
asymptotic normality of $\sqrt n[\hat\Theta_i(Y_n) - \Theta_i]$.

Theorem 4.3. Under Assumption 1, the statistic $\sqrt n[\hat\Theta_i(Y_n) - \Theta_i]$ converges in
distribution to $\mathcal{N}_{m_i\times q_i}(0, R_i^{-1} \otimes (Z_i'\Sigma^{-1}Z_i)^{-1})$ for any $i \in \{1, \ldots, k\}$.
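The covariance factor in Theorem 4.3 arises because, by (13), the limit of the scaled estimation error is the limit in Theorem 4.2 multiplied on the right by $H_iK_i$, and $(H_iK_i)'\Sigma(H_iK_i) = (Z_i'\Sigma^{-1}Z_i)^{-1}$. The sketch below checks this identity, together with (5) and (6), numerically; the profile matrix and the covariance matrix used are arbitrary illustrative choices.

```python
import numpy as np

# Illustrative inputs (assumptions): a linear profile on four time points
# and an arbitrary positive definite covariance matrix.
t = np.array([-1.0, -0.5, 0.5, 1.0])
Z = np.vander(t, N=2, increasing=True)
idx = np.arange(4)
Sigma = 0.5 ** np.abs(idx[:, None] - idx[None, :])
S_inv = np.linalg.inv(Sigma)

P_Z = Z @ np.linalg.inv(Z.T @ Z) @ Z.T
H = S_inv @ Z @ np.linalg.inv(Z.T @ S_inv @ Z) @ Z.T   # population version of (5)
K = Z @ np.linalg.inv(Z.T @ Z)

# Identity (6): Z'H = Z'.
check_eq6 = np.allclose(Z.T @ H, Z.T)
# Second expression in (5), via the Moore-Penrose inverse of a rank-deficient matrix.
check_eq5 = np.allclose(H, S_inv @ np.linalg.pinv(P_Z @ S_inv @ P_Z, rcond=1e-10))
# Row covariance after right-multiplication by H K.
row_cov = (H @ K).T @ Sigma @ (H @ K)
check_cov = np.allclose(row_cov, np.linalg.inv(Z.T @ S_inv @ Z))
```

An explicit cutoff is passed to the pseudo-inverse because the matrix being inverted has rank 2 out of 4, and the numerically zero singular values should be discarded reliably.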

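Next, we shall investigate the asymptotic normality of $\sqrt n[\hat\Sigma(Y_n) - \Sigma]$. The fourth-order
moment of the error matrix will be needed in the following discussion.

Assumption 2. $E(\varepsilon_1) = 0$, $E(\varepsilon_1\varepsilon_1') = \Sigma > 0$, $E(\varepsilon_1 \otimes \varepsilon_1\varepsilon_1') = 0_{p^2\times p}$ and $E\|\varepsilon_1\|^4 < \infty$,
where $\varepsilon_1'$ is the first row vector of the error matrix $\mathcal{E}$.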

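Theorem 4.4. Under Assumptions 1 and 2, the following probability statements hold:

(a) $\sqrt n(\hat\Sigma(Y) - \Sigma)$ converges to $\mathcal{N}(0, \operatorname{Cov}(\varepsilon_1' \otimes \varepsilon_1'))$ in distribution.

(b) For each $i$, $\sqrt n(\hat\Sigma(Y) - \Sigma)$ and $\sqrt n(\hat\Theta_i(Y) - \Theta_i)$ are asymptotically independent.

(c) For any distinct $i, j$, $\sqrt n(\hat\Theta_i(Y) - \Theta_i)$ and $\sqrt n(\hat\Theta_j(Y) - \Theta_j)$ are asymptotically independent.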



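Proof. (a) $\sqrt n(\hat\Sigma(Y) - \Sigma)$ can be decomposed into

$$\sqrt n(\hat\Sigma(Y) - \Sigma) = \Delta_1 + \Delta_2 + \Delta_3,$$

where

$$\Delta_1 = \frac{1}{\sqrt n}\sum_{l=1}^{n}(\varepsilon_l\varepsilon_l' - \Sigma), \qquad
\Delta_2 = \frac{m}{\sqrt n\,(n - m)}\sum_{l=1}^{n}\varepsilon_l\varepsilon_l', \qquad
\Delta_3 = -\frac{\sqrt n}{n - m}\sum_{i=1}^{k}\mathcal{E}'P_{X_i}\mathcal{E}.$$

Similar to the proof of conclusions (10) and (11) in Theorem 3.1, we easily obtain that
$\Delta_2$ and $\Delta_3$ converge to 0 in probability.

Also by Assumptions 1 and 2, $\Delta_1$ converges to $\mathcal{N}(0, \Phi_2)$ in distribution, where $\Phi_2 =
\operatorname{Cov}(\varepsilon_1' \otimes \varepsilon_1')$. Thus, we have

$$\sqrt n\operatorname{vec}(\hat\Sigma(Y) - \Sigma) = \operatorname{vec}(\Delta_1) + o_p(1).$$

Hence, $\sqrt n(\hat\Sigma(Y) - \Sigma)$ converges to $\mathcal{N}(0, \operatorname{Cov}(\varepsilon_1' \otimes \varepsilon_1'))$ in distribution.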

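(b) By equation (13), it suffices to prove the asymptotic independence between
$\frac{1}{\sqrt n}\operatorname{vec}(X_i'\mathcal{E})$ and $\sqrt n\operatorname{vec}(\hat\Sigma(Y) - \Sigma)$.

Let $Q_n = X_i'\mathcal{E} = (x_1, \ldots, x_n)(\varepsilon_1, \ldots, \varepsilon_n)'$. Then

$$\operatorname{Cov}\Bigl(\frac{1}{\sqrt n}X_i'\mathcal{E},\ \sqrt n(\hat\Sigma(Y) - \Sigma)\Bigr)
= \operatorname{Cov}\Bigl(\frac{1}{\sqrt n}\sum_{l=1}^{n}x_l\varepsilon_l',\ \frac{1}{\sqrt n}\sum_{l=1}^{n}\varepsilon_l\varepsilon_l'\Bigr) + o(1).$$

According to Assumption 2, the third-order moment $E(\varepsilon_1 \otimes \varepsilon_1\varepsilon_1')$ vanishes, so
$\operatorname{Cov}(\frac{1}{\sqrt n}X_i'\mathcal{E}, \sqrt n(\hat\Sigma(Y) - \Sigma))$ converges to 0.
It follows that the vectors $\frac{1}{\sqrt n}\operatorname{vec}(X_i'\mathcal{E})$ and $\sqrt n\operatorname{vec}(\hat\Sigma(Y) - \Sigma)$ are asymptotically
independent. Therefore, $\sqrt n(\hat\Sigma(Y) - \Sigma)$ and $\sqrt n(\hat\Theta_i(Y) - \Theta_i)$ also are asymptotically
independent.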

(c) For any distinct $i, j$, it follows from condition (2) that

$$\operatorname{Cov}(\sqrt n(\hat\Theta_i(Y) - \Theta_i), \sqrt n(\hat\Theta_j(Y) - \Theta_j)) = 0.$$

We have completed the proofs of statements (a)-(c). □



Sometimes, it is necessary to consider hypothesis tests of the form

$$H_i : C\Theta_iV' = 0,$$






J. Hu, G. Yan and J. You 



where $C$ and $V$ are, respectively, $s \times m_i$ and $t \times q_i$ constant matrices. In this case,
Theorem 4.3 and Slutsky's theorem are explored to understand the asymptotic behavior
of $\sqrt n(C\hat\Theta_i(Y)V' - C\Theta_iV')$.

Corollary 4.5. Under Assumption 1, if matrices $C(X_i'X_i)^{-1}C'$ and $V(Z_i'\hat\Sigma^{-1}(Y)Z_i)^{-1}V'$
are non-singular, then the statistic

$$(Cn(X_i'X_i)^{-1}C')^{-1/2}\,\sqrt n\,(C\hat\Theta_i(Y)V')\,(V(Z_i'\hat\Sigma^{-1}(Y)Z_i)^{-1}V')^{-1/2}$$

under $H_i$ converges in distribution to $\mathcal{N}_{s\times t}(0, I)$.

Therefore, if it is necessary to make the statistical inference about certain $\Theta_i$ in the
model (1), we can take the normal distribution $\mathcal{N}_{s\times t}(0, I)$ as an approximate distribution
of the statistic

$$(Cn(X_i'X_i)^{-1}C')^{-1/2}\,\sqrt n\,(C\hat\Theta_i(Y)V')\,(V(Z_i'\hat\Sigma^{-1}(Y)Z_i)^{-1}V')^{-1/2}$$

under $H_i$ if the sample size is large. Moreover, due to (c) of Theorem 4.4, $H_i$ and $H_j$ can
be considered independently.

5. Simulation studies

In this section, we shall use simulation to investigate the efficiency and parsimony of the
model (1) subject to constraint (2), compared with the traditional growth curve model

$$Y = X\Theta Z' + \mathcal{E}.$$

We take an example as follows. Suppose $n$ patients are divided into two groups with
numbers of patients $n_1$ and $n_2$, respectively. A certain measurement in an active drug
trial is made on each of the $n_1$ patients taking a placebo and the $n_2$ patients taking the
active drug at time points $t_1 = -1$, $t_2 = -0.5$, $t_3 = 0.5$ and $t_4 = 1$. Assume that the first
$n_1$ observations come from normal distribution $\mathcal{N}(\mu_1, \Sigma_0)$, where

$$\mu_1 = (4 + 2t_1,\ 4 + 2t_2,\ 4 + 2t_3,\ 4 + 2t_4)$$

and $\Sigma_0$ is a covariance matrix governed by a correlation parameter $\rho$.
(The model with this $\Sigma_0$ is called the simplest serial correlation model in literature.) It
means that the $n_1$ observations have a linear profile form of time points. The remaining
$n_2$ observations come from normal distribution $\mathcal{N}(\mu_2, \Sigma_0)$, where

$$\mu_2 = (3 + 2t_1 + t_1^2 - t_1^3,\ 3 + 2t_2 + t_2^2 - t_2^3,\ 3 + 2t_3 + t_3^2 - t_3^3,\ 3 + 2t_4 + t_4^2 - t_4^3),$$

implying that the $n_2$ observations have a cubic polynomial profile form of time points.
Let

$$Z_1' = \begin{pmatrix} 1 & 1 & 1 & 1 \\ t_1 & t_2 & t_3 & t_4 \end{pmatrix}
\quad\text{and}\quad
Z_2' = \begin{pmatrix} 1 & 1 & 1 & 1 \\ t_1 & t_2 & t_3 & t_4 \\ t_1^2 & t_2^2 & t_3^2 & t_4^2 \\ t_1^3 & t_2^3 & t_3^3 & t_4^3 \end{pmatrix}$$

and

$$B_1 = (4 \quad 2), \qquad B_2 = (3 \quad 2 \quad 1 \quad -1),$$

then

$$\mu_1 = B_1Z_1' \quad\text{and}\quad \mu_2 = B_2Z_2'.$$

In real experiments, with observations Y , model specification is a challenging task. 

Without loss of generality, we shall consider three approaches using the growth curve 
model to fit data of repeated measurements from the above synthetic example. 

The first approach is to regard all observations of repeated measurements as having
linear profile forms over multiple time points. In this scenario, model underfitting has
occurred. The underfitted model is denoted by $\psi_u$,

$$\text{Model } \psi_u : Y = X\Theta_uZ_1' + \mathcal{E},$$

where $X = \begin{pmatrix} 1_{n_1} & 0 \\ 0 & 1_{n_2} \end{pmatrix}$ and
$\Theta_u = \begin{pmatrix} \theta_{11} & \theta_{12} \\ \theta_{21} & \theta_{22} \end{pmatrix}$, to fit the $n$ observations. By statement (2) of Theorem 2.1,
the estimator is $\hat\Theta_u = (X'X)^{-1}X'Y\hat\Sigma^{-1}(Y)Z_1(Z_1'\hat\Sigma^{-1}(Y)Z_1)^{-1}$.

The second approach is to regard all observations of repeated measurements as
following cubic polynomial profile forms over multiple time points. In this case, model
overfitting has occurred. The overfitted model is denoted by $\psi_o$,

$$\text{Model } \psi_o : Y = X\Theta_oZ_2' + \mathcal{E},$$

where $\Theta_o = \begin{pmatrix} \theta_{11} & \theta_{12} & \theta_{13} & \theta_{14} \\ \theta_{21} & \theta_{22} & \theta_{23} & \theta_{24} \end{pmatrix}$, to fit the $n$ observations. The estimators of regression
coefficients are $\hat\Theta_o = (X'X)^{-1}X'Y\hat\Sigma^{-1}(Y)Z_2(Z_2'\hat\Sigma^{-1}(Y)Z_2)^{-1}$.

The third approach is to regard the first $n_1$ observations as having a linear profile form
of time points and the remaining $n_2$ observations as having a cubic polynomial profile
form over multiple time points. In this case, model misspecification may not occur. The
additive model is denoted by $\psi_a$,

$$\text{Model } \psi_a : Y = X_1\Theta_1Z_1' + X_2\Theta_2Z_2' + \mathcal{E},$$

where $X_1 = \begin{pmatrix} 1_{n_1} \\ 0 \end{pmatrix}$, $X_2 = \begin{pmatrix} 0 \\ 1_{n_2} \end{pmatrix}$,
$\Theta_1 = (\theta_{11} \ \ \theta_{12})$ and $\Theta_2 = (\theta_{21} \ \ \theta_{22} \ \ \theta_{23} \ \ \theta_{24})$, to fit the
$n$ observations. Based on the above assumption, $\psi_a$ actually is the true model. The
estimators are $\hat\Theta_i = (X_i'X_i)^{-1}X_i'Y\hat\Sigma^{-1}(Y)Z_i(Z_i'\hat\Sigma^{-1}(Y)Z_i)^{-1}$ for $i = 1, 2$.

The model specification starts with residuals. We shall use a residual matrix $R$, defined
as the difference between the observation $Y$ and the fitted mean $\hat Y$, that is, $R = Y - \hat Y$, to
discuss the model specification for our example.

The residual matrix sum of squares (RMSS) is defined as the trace of $R'R$:

$$\mathrm{RMSS} = \|R\|^2 = \operatorname{tr}((Y - \hat Y)'(Y - \hat Y)). \tag{14}$$

Usually, overfitting of model specification can provide a smaller RMSS as well as use 
more parameters. The residual matrix sum of squares and the number of parameters are 
two trade-off issues in model specification. Here, Akaike's information criterion (AIC) - 
see [1] - is explored to reward the decreasing RMSS and penalize overparametrization. 
Akaike's information criterion formula is given by 

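$$\mathrm{AIC} = n\ln(\mathrm{RMSS}) + 2(p + 1) - n\ln(n).$$

Specifically, we have the following three AICs for the above chosen three models:

$$\mathrm{AIC}_u = n\ln(\operatorname{tr}((Y - X\hat\Theta_uZ_1')'(Y - X\hat\Theta_uZ_1'))) + 2(p_u + 1) - n\ln(n),$$

$$\mathrm{AIC}_o = n\ln(\operatorname{tr}((Y - X\hat\Theta_oZ_2')'(Y - X\hat\Theta_oZ_2'))) + 2(p_o + 1) - n\ln(n)$$

and

$$\mathrm{AIC}_a = n\ln(\operatorname{tr}((Y - X_1\hat\Theta_1Z_1' - X_2\hat\Theta_2Z_2')'(Y - X_1\hat\Theta_1Z_1' - X_2\hat\Theta_2Z_2'))) + 2(p_a + 1) - n\ln(n),$$

where $p_u$, $p_o$ are the numbers of parameters in $\Theta_u$, $\Theta_o$, respectively, and $p_a$ is the sum of
the numbers of parameters in $\Theta_1$ and $\Theta_2$.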
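The criterion is straightforward to code once a fitted mean is available. The following minimal sketch implements (14) and the AIC formula as stated above; the function names `rmss` and `aic` are illustrative, and the parameter count is supplied by the caller.

```python
import numpy as np

def rmss(Y, Y_hat):
    """Residual matrix sum of squares, tr((Y - Y_hat)'(Y - Y_hat))."""
    R = np.asarray(Y, dtype=float) - np.asarray(Y_hat, dtype=float)
    return float(np.trace(R.T @ R))

def aic(Y, Y_hat, n_params):
    """AIC = n ln(RMSS) + 2 (n_params + 1) - n ln(n), with n the number of rows of Y."""
    n = np.asarray(Y).shape[0]
    return n * np.log(rmss(Y, Y_hat)) + 2 * (n_params + 1) - n * np.log(n)

# Tiny hand-checkable case: n = 2 and RMSS = 2, so the log terms cancel
# and only the penalty 2 * (3 + 1) = 8 remains.
example_value = aic(np.eye(2), np.zeros((2, 2)), n_params=3)
```

For the three models of this section the parameter counts would be 4 for the underfitted model, 8 for the overfitted model and 2 + 4 = 6 for the additive model, following the definitions of $p_u$, $p_o$ and $p_a$.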

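In our simulation, consider $n_1 = n_2 = n/2$, replication times $N = 10\,000$ and $\rho = 0.2$, 0.5
and 0.8, respectively.

With $N$ replication times, the average values of $\mathrm{AIC}_u$, $\mathrm{AIC}_o$ and $\mathrm{AIC}_a$ are denoted by

$$\mathrm{AIC}(\psi_u) = \frac{1}{N}\sum_{i=1}^{N}\mathrm{AIC}_u^{i}, \qquad
\mathrm{AIC}(\psi_o) = \frac{1}{N}\sum_{i=1}^{N}\mathrm{AIC}_o^{i}, \qquad
\mathrm{AIC}(\psi_a) = \frac{1}{N}\sum_{i=1}^{N}\mathrm{AIC}_a^{i}.$$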
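A scaled-down version of this experiment can be sketched as follows. Several ingredients are assumptions made for the sketch and are labeled in the code: the covariance is taken to have entries rho^|i-j| (one common reading of a serial correlation structure), the cubic mean uses coefficients (3, 2, 1, -1), and only 20 replications at a single sample size are run instead of 10 000.

```python
import numpy as np

t = np.array([-1.0, -0.5, 0.5, 1.0])
Z1 = np.vander(t, N=2, increasing=True)    # linear profile
Z2 = np.vander(t, N=4, increasing=True)    # cubic profile
B1 = np.array([[4.0, 2.0]])
B2 = np.array([[3.0, 2.0, 1.0, -1.0]])     # assumption: coefficients of the cubic mean

def fit_mean(Y, X_list, Z_list):
    """Two-stage GLS fitted mean, sum_i P_{X_i} Y H_i(Y)."""
    n = Y.shape[0]
    coefs = [np.linalg.solve(X.T @ X, X.T @ Y) for X in X_list]
    resid = Y - sum(X @ c for X, c in zip(X_list, coefs))
    r = n - sum(np.linalg.matrix_rank(X) for X in X_list)
    s_inv = np.linalg.inv(Y.T @ resid / r)
    mu_hat = np.zeros_like(Y)
    for X, Z, c in zip(X_list, Z_list, coefs):
        H = s_inv @ Z @ np.linalg.solve(Z.T @ s_inv @ Z, Z.T)
        mu_hat += X @ c @ H
    return mu_hat

def aic(Y, mu_hat, n_params):
    n = Y.shape[0]
    R = Y - mu_hat
    return n * np.log(np.trace(R.T @ R)) + 2 * (n_params + 1) - n * np.log(n)

def average_aics(n, rho, n_rep, seed=0):
    rng = np.random.default_rng(seed)
    X1 = np.zeros((n, 1))
    X1[: n // 2] = 1.0
    X2 = 1.0 - X1
    X = np.hstack([X1, X2])
    idx = np.arange(4)
    Sigma0 = rho ** np.abs(idx[:, None] - idx[None, :])   # assumed covariance form
    chol = np.linalg.cholesky(Sigma0)
    mean = X1 @ B1 @ Z1.T + X2 @ B2 @ Z2.T
    totals = np.zeros(3)
    for _ in range(n_rep):
        Y = mean + rng.standard_normal((n, 4)) @ chol.T
        totals += [
            aic(Y, fit_mean(Y, [X], [Z1]), 4),             # underfitted model
            aic(Y, fit_mean(Y, [X], [Z2]), 8),             # overfitted model
            aic(Y, fit_mean(Y, [X1, X2], [Z1, Z2]), 6),    # additive model
        ]
    return dict(zip(("under", "over", "additive"), totals / n_rep))

avg = average_aics(n=400, rho=0.5, n_rep=20)
```

Under these assumptions the additive model should attain the smallest average AIC at this sample size; reproducing the curves in the figures would require looping over n and the three values of rho with the full number of replications.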

The relations between the sample size $n$ and $\mathrm{AIC}(\psi_u)$, $\mathrm{AIC}(\psi_o)$ and $\mathrm{AIC}(\psi_a)$ are
illustrated in Figures 1-3 for $\rho = 0.2$, 0.5 and 0.8, respectively. We can make the following
conclusions from these curves:

(1) Akaike's information criterion of the true model $\psi_a$ remains uniformly
smallest for all cases of $\rho = 0.2$, 0.5 and 0.8. The trend becomes particularly obvious as the
sample size $n$ increases. We believe that the conclusion is true for all $\rho \in (0, 1)$.

(2) The curve for AIC of the true model $\psi_a$ and the curve for AIC of the overfitted
model $\psi_o$ are parallel. It implies that the difference between the AIC of the true model
and the AIC of the overfitted model $\psi_o$ is a constant. The constant is due to the penalty for
overparametrization. This shows that the difference between the
RMSS for the true model $\psi_a$ and the RMSS for the overfitted model is not significant. Overfitting gets a
penalty for overparametrization and leads to a bigger AIC.

Figure 1. $\mathrm{AIC}(\psi_u)$, $\mathrm{AIC}(\psi_o)$, $\mathrm{AIC}(\psi_a)$ and sample size $n$ for $\rho = 0.2$.



(3) The underfitted model $\psi_u$ seems to have a bigger AIC than the overfitted model.
It means that underfitting incurs more loss than overfitting does in terms of AIC.
The loss becomes larger and larger as the sample size increases or $\rho$ is closer and closer
to 1.

(4) The curve for AIC of the underfitted model becomes a little bit steeper as $\rho$
gradually approaches 0, while the curve for AIC of the overfitted model and the curve for
AIC of the true model seem to be unrelated to $\rho$.

In conclusion, using the additive growth curve model (1) with orthogonal design
matrices has an obvious advantage over using the traditional growth curve model in model
specification and then in parameter estimation.

Figure 2. $\mathrm{AIC}(\psi_u)$, $\mathrm{AIC}(\psi_o)$, $\mathrm{AIC}(\psi_a)$ and sample size $n$ for $\rho = 0.5$.

Figure 3. $\mathrm{AIC}(\psi_u)$, $\mathrm{AIC}(\psi_o)$, $\mathrm{AIC}(\psi_a)$ and sample size $n$ for $\rho = 0.8$ (plot of average AIC; legend: underfitting, overfitting, additive).



Table 1. Measurements on 11 girls and 16 boys, at 4 different ages: 8, 10, 12, 14

Girls    8       10      12      14        Boys     8       10      12      14
1        21      20      21.5    23        1        26      25      29      31
2        21      21.5    24      25.5      2        21.5    22.5    23      26.5
3        20.5    24      24.5    26        3        23      22.5    24      27.5
4        23.5    24.5    25      26.5      4        25.5    27.5    26.5    27
5        21.5    23      22.5    23.5      5        20      23.5    22.5    26
6        20      21      21      22.5      6        24.5    25.5    27      28.5
7        21.5    22.5    23      25        7        22      22      24.5    26.5
8        23      23      23.5    24        8        24      21.5    24.5    25.5
9        20      21      22      21.5      9        23      20.5    31      26
10       16.5    19      19      19.5      10       27.5    28      31      31.5
11       24.5    25      28      28        11       23      23      23.5    25
                                           12       21.5    23.5    24      28
                                           13       17      24.5    26      29.5
                                           14       22.5    25.5    25.5    26
                                           15       23      24.5    26      30
                                           16       22      21.5    23.5    25
Mean     21.18   22.23   23.09   24.09     Mean     22.87   23.81   25.72   27.47



6. A numerical example 

The numerical example, stated in [14], about a certain measurement in a dental study
on 11 girls and 16 boys at 4 different ages (8, 10, 12 and 14) is employed here (see
Table 1) to illustrate the ideas and techniques stated in the paper.

Table 2. Parameter pair $(g, b)$ and AICs for 9 models

(g, b)    AIC         (g, b)    AIC         (g, b)    AIC
(1, 1)    90.4011*    (1, 2)    92.2497     (1, 3)    94.1817
(2, 1)    92.4009     (2, 2)    94.2495     (2, 3)    96.1815
(3, 1)    94.3972     (3, 2)    96.2458     (3, 3)    98.1777



Prior to making the model specification, we do not know whether the distances, in
millimeters, from the center of the pituitary to the pteryo-maxillary fissure of these girls
and boys follow two polynomial functions of time $t$ with the same degree. So we assume
that the distances for girls and for boys follow two polynomial functions of time $t$ with
different degrees $g$ and $b$ (set $1 \le g, b \le 3$).

Based on the model (1), we think of these observations as realizations of the following
model:

$$Y = X_1\Theta_1Z_1' + X_2\Theta_2Z_2' + \mathcal{E},$$

where

$$X_1 = \begin{pmatrix} 1_{11} \\ 0 \end{pmatrix}, \qquad
\Theta_1 = (\theta_{10} \ \cdots \ \theta_{1g}), \qquad
Z_1' = \begin{pmatrix} 1 & 1 & 1 & 1 \\ \vdots & \vdots & \vdots & \vdots \\ t_1^{g} & t_2^{g} & t_3^{g} & t_4^{g} \end{pmatrix}
\quad\text{for } 1 \le g \le 3,$$

and

$$X_2 = \begin{pmatrix} 0 \\ 1_{16} \end{pmatrix}, \qquad
\Theta_2 = (\theta_{20} \ \cdots \ \theta_{2b}), \qquad
Z_2' = \begin{pmatrix} 1 & 1 & 1 & 1 \\ \vdots & \vdots & \vdots & \vdots \\ t_1^{b} & t_2^{b} & t_3^{b} & t_4^{b} \end{pmatrix}
\quad\text{for } 1 \le b \le 3.$$

We should trade off the effect of the RMSS from the simple "true" model against the loss
from overparameterization. Due to setting $1 \le g, b \le 3$, we can structure nine models for
selection. The corresponding AICs of the nine models are displayed in Table 2.

The best model is the one with the minimum AIC. Based on AIC, the model with
parameter pair (1, 1) is best, that is, the growth curves for girls and boys are two linear
equations of time $t$. Our conclusion of model specification is consistent with the chosen
model of [14].
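The nine-model comparison can be sketched directly from the data in Table 1. The code below is an illustration under stated assumptions: the time variable is centred and rescaled (which changes neither the column space of the profile matrices nor the fitted means), the number of parameters for the pair (g, b) is taken to be (g + 1) + (b + 1), and the helper names are hypothetical. It is not claimed to reproduce Table 2 digit for digit.

```python
import itertools
import numpy as np

ages = np.array([8.0, 10.0, 12.0, 14.0])
girls = np.array([
    [21, 20, 21.5, 23], [21, 21.5, 24, 25.5], [20.5, 24, 24.5, 26],
    [23.5, 24.5, 25, 26.5], [21.5, 23, 22.5, 23.5], [20, 21, 21, 22.5],
    [21.5, 22.5, 23, 25], [23, 23, 23.5, 24], [20, 21, 22, 21.5],
    [16.5, 19, 19, 19.5], [24.5, 25, 28, 28],
])
boys = np.array([
    [26, 25, 29, 31], [21.5, 22.5, 23, 26.5], [23, 22.5, 24, 27.5],
    [25.5, 27.5, 26.5, 27], [20, 23.5, 22.5, 26], [24.5, 25.5, 27, 28.5],
    [22, 22, 24.5, 26.5], [24, 21.5, 24.5, 25.5], [23, 20.5, 31, 26],
    [27.5, 28, 31, 31.5], [23, 23, 23.5, 25], [21.5, 23.5, 24, 28],
    [17, 24.5, 26, 29.5], [22.5, 25.5, 25.5, 26], [23, 24.5, 26, 30],
    [22, 21.5, 23.5, 25],
])
Y = np.vstack([girls, boys])
n = Y.shape[0]
X1 = np.r_[np.ones(11), np.zeros(16)].reshape(-1, 1)   # girls indicator
X2 = 1.0 - X1                                          # boys indicator

def profile(degree):
    """Profile matrix with columns 1, t, ..., t^degree on centred, rescaled ages."""
    t = (ages - ages.mean()) / 2.0   # assumption: rescaling only improves conditioning
    return np.vander(t, N=degree + 1, increasing=True)

def fitted_mean(Y, X_list, Z_list):
    coefs = [np.linalg.solve(X.T @ X, X.T @ Y) for X in X_list]
    resid = Y - sum(X @ c for X, c in zip(X_list, coefs))
    r = Y.shape[0] - sum(np.linalg.matrix_rank(X) for X in X_list)
    s_inv = np.linalg.inv(Y.T @ resid / r)
    out = np.zeros_like(Y)
    for X, Z, c in zip(X_list, Z_list, coefs):
        out += X @ c @ s_inv @ Z @ np.linalg.solve(Z.T @ s_inv @ Z, Z.T)
    return out

def aic(Y, mu_hat, n_params):
    R = Y - mu_hat
    return n * np.log(np.trace(R.T @ R)) + 2 * (n_params + 1) - n * np.log(n)

aic_table = {}
for g, b in itertools.product((1, 2, 3), repeat=2):
    mu_hat = fitted_mean(Y, [X1, X2], [profile(g), profile(b)])
    aic_table[(g, b)] = aic(Y, mu_hat, (g + 1) + (b + 1))

best_pair = min(aic_table, key=aic_table.get)
```

Printing `aic_table` and `best_pair` allows a side-by-side comparison with Table 2; any discrepancy would point to a difference between the assumptions listed above and the computation used for the table.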



7. Concluding remarks 

When observations of a repeated measurement at multiple time points follow polynomial
functions with different degrees rather than the same degree, using the traditional growth
curve model may cause underfitting or overfitting. To avoid these troubles, we proposed
an additive growth curve model (1) with orthogonal design matrices that allows us to fit
the data and then estimate parameters in a more parsimonious and efficient way than the
traditional growth curve model. Obviously, the proposed additive growth curve model
cannot be included in the extended growth curve models investigated by Kollo and von
Rosen [9], Chapter 4, and Verbyla and Venables [17].

In the paper, we explored the least-squares approach to derive two-stage GLS
estimators for the regression coefficients, where an invariant and unbiased quadratic estimator
for the covariance of observations is taken as the first-stage estimator. We investigated
the properties of these estimators, including unbiasedness, consistency and asymptotic
normality.

Simulation studies and a numerical example are given to illustrate the efficiency and
parsimony of the proposed model for model specification in the sense of minimizing AIC
compared to the traditional growth curve model. It follows that our additive growth
curve model and the least-squares estimation for regression coefficients are competitive
alternatives to the traditional growth curve model.

Appendix 

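Proof of Theorem 2.1. Put $T = (X_1 \otimes Z_1, \ldots, X_k \otimes Z_k)$ and $\beta = ((\operatorname{vec}(\Theta_1))', \ldots, (\operatorname{vec}(\Theta_k))')'$.
Then the model (1) can be rewritten as $\operatorname{vec}(\mu) = T\beta$ where $\mathcal{C}(T) = \mathcal{C}(X_1 \otimes Z_1) + \cdots +
\mathcal{C}(X_k \otimes Z_k)$.

(a) To obtain the two-stage generalized least-squares estimate of $\mu$ from the data $Y$ is
equivalent to applying the ordinary least-squares method to the following model

$$\operatorname{vec}(Z) = \operatorname{vec}(\nu) + (I \otimes \hat\Sigma^{-1/2}(Y))\operatorname{vec}(\mathcal{E}), \tag{15}$$

where $\operatorname{vec}(Z) = (I \otimes \hat\Sigma^{-1/2}(Y))\operatorname{vec}(Y)$ and $\operatorname{vec}(\nu) = (I \otimes \hat\Sigma^{-1/2}(Y))T\beta$. So the ordinary
least-squares estimator of $\operatorname{vec}(\nu)$ is

$$\operatorname{vec}(\hat\nu_{\mathrm{ols}}(Z)) = P_{(I \otimes \hat\Sigma^{-1/2}(Y))T}\operatorname{vec}(Z). \tag{16}$$

Thus by equations (15) and (16)

$$\operatorname{vec}(\hat\mu(Y)) = (I \otimes \hat\Sigma^{1/2}(Y))P_{(I \otimes \hat\Sigma^{-1/2}(Y))T}(I \otimes \hat\Sigma^{-1/2}(Y))\operatorname{vec}(Y). \tag{17}$$

Since

$$P_{(I \otimes \hat\Sigma^{-1/2}(Y))T} = (I \otimes \hat\Sigma^{-1/2}(Y))T(T'(I \otimes \hat\Sigma(Y))^{-1}T)^{+}T'(I \otimes \hat\Sigma^{-1/2}(Y)), \tag{18}$$

by equations (17) and (18), we obtain

$$\operatorname{vec}(\hat\mu(Y)) = T(T'(I \otimes \hat\Sigma(Y))^{-1}T)^{+}T'(I \otimes \hat\Sigma(Y))^{-1}\operatorname{vec}(Y).$$

By Kronecker product operations and (2), $\operatorname{vec}(\hat\mu(Y))$ reduces to

$$\operatorname{vec}(\hat\mu(Y)) = \sum_{i=1}^{k}\{X_i(X_i'X_i)^{-}X_i' \otimes Z_i(Z_i'\hat\Sigma^{-1}(Y)Z_i)^{+}Z_i'\hat\Sigma^{-1}(Y)\}\operatorname{vec}(Y). \tag{19}$$

Since $(Z_i(Z_i'Z_i)^{-}Z_i'\hat\Sigma^{-1}(Y)Z_i(Z_i'Z_i)^{-}Z_i')^{+} = Z_i(Z_i'\hat\Sigma^{-1}(Y)Z_i)^{+}Z_i'$, in matrix language,
we obtain the expression (7) of $\hat\mu(Y)$ by rewriting (19).

(b) It follows from equation (3) and the condition (2).

(c) To prove the unbiasedness of the $\hat\Theta_i$'s, by (4) and (7), it suffices to show that $\hat\mu(Y)$ is
an unbiased estimator of $\mu$.

Since the distribution of $\mathcal{E}$ is symmetric about the origin and $\hat\Sigma(Y) = \hat\Sigma(\mathcal{E}) = \hat\Sigma(-\mathcal{E})$,
we have $H_i(-\mathcal{E}) = H_i(\mathcal{E})$ and $E(\mathcal{E}H_i(\mathcal{E})) = 0$. By (5) and (6), $\hat\mu(Y)$ can
be expressed as

$$\hat\mu(Y) = \sum_{i=1}^{k}P_{X_i}YH_i(\mathcal{E}) = \sum_{i=1}^{k}X_i\Theta_iZ_i' + \sum_{i=1}^{k}P_{X_i}\mathcal{E}H_i(\mathcal{E}).$$

And

$$E(\hat\mu(Y)) = \sum_{i=1}^{k}X_i\Theta_iZ_i' + \sum_{i=1}^{k}P_{X_i}E(\mathcal{E}H_i(\mathcal{E})) = \sum_{i=1}^{k}X_i\Theta_iZ_i' = \mu,$$

completing the proof. □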

Proof of Lemma 4.1. Some subscript $i$'s are ignored in the following statements. Write
$V = \frac{1}{\sqrt n}X_i' = [v_1, \ldots, v_n]$. The transpose of $v_j$ is an $m_i$-element row vector as follows,

$$\Bigl(\frac{1}{\sqrt n}a_{j1}, \ldots, \frac{1}{\sqrt n}a_{jm_i}\Bigr),$$

where $X_i = [a_{jl}]_{n\times m_i}$. By (12), $VV' = n^{-1}X_i'X_i$ converges to a positive definite matrix
$R_i$. So the elements of $VV' = v_1v_1' + \cdots + v_nv_n'$ are bounded. We claim that, for any
$j \in \{1, \ldots, n\}$, the $m_i$ elements of $v_j$ are all $O(n^{-1/2})$.

If this is not true, we can assume without loss of generality that one element of $v_n$
is $O(n^{\rho-1/2})$ with $\rho > 0$. Then one element of $v_nv_n'$ would be $O(n^{2\rho-1})$. Hence, the
corresponding element in matrix $VV' = v_1v_1' + \cdots + v_nv_n'$ would be $O(n^{2\rho})$, which is not
bounded. This is a contradiction to condition (12).

Since

$$(\sqrt n\,s_{i1}, \ldots, \sqrt n\,s_{in}) = \sqrt n(X_i'X_i)^{-1}X_i' = n(X_i'X_i)^{-1}\frac{1}{\sqrt n}X_i' = n(X_i'X_i)^{-1}[v_1, \ldots, v_n],$$

namely, for $j = 1, \ldots, n$, $\sqrt n\,s_{ij} = n(X_i'X_i)^{-1}v_j$. Thus, for $j = 1, \ldots, n$, the $m_i$ elements
of $\sqrt n\,s_{ij}$ are also $O(n^{-1/2})$, completing the proof. □

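Proof of Theorem 4.2. Fix $i$. Let $T_i = S_i\mathcal{E} \in \mathcal{M}_{m_i\times p}$. Then $T_i$ can be rewritten as

$$T_i = \sum_{j=1}^{n}s_{ij}\varepsilon_j',$$

where $s_{ij}$ is the $j$th column vector of $S_i$ and $\varepsilon_j'$ is the $j$th row vector of the matrix $\mathcal{E}$
with $\mathcal{E} \sim \mathcal{G}(0, I_n \otimes \Sigma)$.

Since $\{\varepsilon_j\}_{j=1}^{n}$ are independent and identically distributed, for $t \in \mathcal{M}_{m_i\times p}$, the
characteristic function $\Psi_n(t)$ of $\sqrt n\,T_i$ is given by

$$\Psi_n(t) = E(\exp\{\mathrm{i}\operatorname{tr}(\sqrt n\,t'T_i)\})
= E\Bigl(\exp\Bigl\{\mathrm{i}\operatorname{tr}\Bigl(\sum_{j=1}^{n}\sqrt n\,t's_{ij}\varepsilon_j'\Bigr)\Bigr\}\Bigr)
= \prod_{j=1}^{n}E(\exp\{\mathrm{i}\operatorname{tr}(\sqrt n\,t's_{ij}\varepsilon_j')\})
= \prod_{j=1}^{n}\Phi(\sqrt n\,t's_{ij}),$$

where $\Phi(\cdot)$ is the characteristic function of $\varepsilon_j$.

Recall that for $u$ in the neighborhood of 0,

$$\ln(1 - u) = -u + f(u) \quad\text{with}\quad f(u) = -\tfrac{1}{2}u^2 + o(u^2). \tag{20}$$

Write $p(u) = f(u)/u$, then from (20),

$$p(u) = O(u) \quad\text{as } u \to 0. \tag{21}$$

And

$$\Phi(x) = 1 - \tfrac{1}{2}x'\Sigma x + g(x) \ \text{ for } x \in \mathbb{R}^p \quad\text{and}\quad g(x) = o(\|x\|^2) \ \text{ as } x \to 0. \tag{22}$$

For $\epsilon > 0$, there exists $\delta(\epsilon) > 0$ such that

$$|g(x)| < \epsilon\|x\|^2 \quad\text{as } 0 < \|x\| < \delta(\epsilon). \tag{23}$$

By (20) and (22),

$$\ln(\Phi(\sqrt n\,t's_{ij})) = \ln\bigl(1 - \tfrac{1}{2}n\,s_{ij}'t\Sigma t's_{ij} + g(\sqrt n\,t's_{ij})\bigr)
= -\tfrac{1}{2}n\,s_{ij}'t\Sigma t's_{ij} + g(\sqrt n\,t's_{ij}) + f\bigl(\tfrac{1}{2}n\,s_{ij}'t\Sigma t's_{ij} - g(\sqrt n\,t's_{ij})\bigr).$$

Therefore, the characteristic function of $\sqrt n\,T_i$ can be decomposed as

$$\Psi_n(t) = \exp\Bigl\{\sum_{j=1}^{n}\ln(\Phi(\sqrt n\,t's_{ij}))\Bigr\} = \exp\bigl\{-\tfrac{1}{2}\alpha_n + \beta_n + \eta_n\bigr\}, \tag{24}$$

where

$$\alpha_n = \sum_{j=1}^{n}n\,s_{ij}'t\Sigma t's_{ij}, \qquad
\beta_n = \sum_{j=1}^{n}g(\sqrt n\,t's_{ij})
\quad\text{and}\quad
\eta_n = \sum_{j=1}^{n}f\bigl(\tfrac{1}{2}n\,s_{ij}'t\Sigma t's_{ij} - g(\sqrt n\,t's_{ij})\bigr). \tag{25}$$

Note that

$$\sum_{j=1}^{n}s_{ij}s_{ij}' = S_iS_i' = (X_i'X_i)^{-1}.$$

For $\alpha_n$, by (25), we have

$$\alpha_n = \operatorname{tr}\Bigl(n\Sigma t'\sum_{j=1}^{n}s_{ij}s_{ij}'\,t\Bigr) = \operatorname{tr}(n\Sigma t'(X_i'X_i)^{-1}t). \tag{26}$$

By (12),

$$\lim_{n\to\infty}\alpha_n = \operatorname{vec}(t)'(R_i^{-1} \otimes \Sigma)\operatorname{vec}(t). \tag{27}$$

For $\beta_n$, by Lemma 4.1 and the continuity of $t's_{ij}$, for the $\delta(\epsilon) > 0$ in (23), there exists an
integer $N(\epsilon) > 0$ such that for $n > N(\epsilon)$,

$$0 < \|\sqrt n\,t's_{ij}\| < \delta(\epsilon) \quad\text{for all } j = 1, \ldots, n. \tag{28}$$

Take $n > N(\epsilon)$, then by (23) and (28),

$$|g(\sqrt n\,t's_{ij})| < \|\sqrt n\,t's_{ij}\|^2\epsilon. \tag{29}$$

By (25),

$$|\beta_n| \le \sum_{j=1}^{n}|g(\sqrt n\,t's_{ij})| < \sum_{j=1}^{n}\|\sqrt n\,t's_{ij}\|^2\epsilon
= \epsilon\,n\sum_{j=1}^{n}\operatorname{tr}(t's_{ij}s_{ij}'t) = \epsilon\operatorname{tr}(t'n(X_i'X_i)^{-1}t). \tag{30}$$

So by (12), $\limsup_{n\to\infty}|\beta_n| \le \epsilon\operatorname{tr}(t'R_i^{-1}t)$. Since $\epsilon > 0$ is arbitrary, we obtain

$$\lim_{n\to\infty}\beta_n = 0. \tag{31}$$

And for $\eta_n$, let

$$\lambda_j = \tfrac{1}{2}(\sqrt n\,t's_{ij})'\Sigma(\sqrt n\,t's_{ij}) - g(\sqrt n\,t's_{ij}).$$

Thus, by (29),

$$|\lambda_j| \le \tfrac{1}{2}\operatorname{tr}(\sqrt n\,s_{ij}'t\Sigma t'\sqrt n\,s_{ij}) + \|\sqrt n\,t's_{ij}\|^2\epsilon. \tag{32}$$

Take $n > N(\epsilon)$, by Lemma 4.1, the continuity of $t's_{ij}$ and (21), increasing $N(\epsilon)$ if
necessary, we may suppose that for all $j$, $|p(\lambda_j)| < \epsilon$. Since $f(\lambda_j) = p(\lambda_j)\lambda_j$,

$$|\eta_n| \le \sum_{j=1}^{n}|p(\lambda_j)||\lambda_j| < \epsilon\sum_{j=1}^{n}|\lambda_j|.$$

By (32),

$$|\eta_n| < \epsilon\sum_{j=1}^{n}\Bigl[\tfrac{1}{2}\operatorname{tr}(\sqrt n\,s_{ij}'t\Sigma t'\sqrt n\,s_{ij}) + \|\sqrt n\,t's_{ij}\|^2\epsilon\Bigr].$$

Then, taking the same operations as (26) and (30), we obtain the following inequality

$$|\eta_n| < \frac{\epsilon}{2}\operatorname{tr}(n(X_i'X_i)^{-1}t\Sigma t') + \epsilon^2\operatorname{tr}(t'n(X_i'X_i)^{-1}t),$$

namely, by (12),

$$\limsup_{n\to\infty}|\eta_n| \le \frac{\epsilon}{2}\operatorname{tr}(R_i^{-1}t\Sigma t') + \epsilon^2\operatorname{tr}(t'R_i^{-1}t). \tag{33}$$

Due to the arbitrariness of $\epsilon$ and (33),

$$\lim_{n\to\infty}\eta_n = 0. \tag{34}$$

By (27), (31) and (34), we obtain from (24),

$$\lim_{n\to\infty}\Psi_n(t) = \exp\bigl\{-\tfrac{1}{2}\operatorname{vec}(t)'(R_i^{-1} \otimes \Sigma)\operatorname{vec}(t)\bigr\}. \tag{35}$$

So by Lévy's continuity theorem, $\sqrt n\,T_i$ converges in distribution to $\mathcal{N}_{m_i\times p}(0, R_i^{-1} \otimes \Sigma)$,
completing the proof of the desired result. □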